Abstract: Phase balancing is essential to safe power system
operation. We consider a substation connected to multiple phases, 
each with single-phase loads, generation, and energy storage. 
A representative of the substation operates the system and 
aims to minimize the cost of all phases and to balance loads 
among phases. We first consider ideal energy storage with 
lossless charging and discharging, and propose both centralized 
and distributed real-time algorithms taking into account system 
uncertainty. The proposed algorithm does not require any system 
statistics and asymptotically achieves the minimum system cost 
with large energy storage. We then extend the algorithm to 
accommodate more realistic non-ideal energy storage that has 
imperfect charging and discharging. The performance of the 
proposed algorithm is evaluated through extensive simulation and 
compared with that of a benchmark greedy algorithm. Simulation 
shows that our algorithm leads to strong performance over a wide 
range of storage characteristics. 

Index Terms: Distributed algorithm, energy storage, phase
balancing, stochastic optimization. 

I. Introduction 

In North America, many residential customers are connected 
to distribution systems through single-phase lines. Phase balancing,
i.e., maintaining the balance of loads among phases,
is crucial for power grid operation [1]. This is because phase
imbalance can increase energy losses and the risk of failures,
and can also degrade system power quality. With the spread
of single-phase renewable generators, such as wind and solar
generators, and large loads, such as electric vehicles, phase
imbalance could be aggravated and thus deserves more careful
study. For example, the impact of integration of electric
vehicles on phase imbalance was investigated in [2].

Previous works on phase balancing have considered methods
such as phase swapping (e.g., [3]) and feeder reconfiguration
(e.g., [4]). However, these approaches can be ineffective
or can incur extra costs on human resources, maintenance
expenses, and planned outage duration [5]. An alternative
method is to employ energy storage to mitigate the imbalance
among phases, which is the focus of this paper.

Energy storage has been used widely in power grids for 
applications such as energy arbitrage, regulation, and load 
following [6]. Examples of single-phase storage include:

• Traditional standalone storage such as batteries, flywheels,
etc. [7].

• Batteries in single-phase connected buildings such as
plug-in electric vehicles [8].

• Aggregations of small single-phase deferrable loads, e.g.,
residential thermostatically controlled loads or electric
vehicle garages, which have been shown to be representable
as equivalent storage [9]-[10].

The control of energy storage for power grid applications 
is, however, generally a challenging problem due to storage 
characteristics as well as system uncertainty. There are many 
existing works on storage control in power grids. For example, 
using stochastic dynamic programming, the authors of [11]
proposed a stationary optimal policy for power balancing, and
the authors of [12] investigated both optimal and suboptimal
policies for energy balancing. Nevertheless, the derivation of an
optimal policy under dynamic programming generally relies
on system statistics and some specific form of the problem
structure, and therefore it cannot be easily extended. Similarly,
the authors of [13] considered stochastic model predictive
control. However, the algorithm performance can only be 
evaluated through numerical examples. 

Besides the above two approaches, several recent works 
have employed Lyapunov optimization [14] for energy storage
control. In particular, the authors of [15] investigated a
power-cost minimization problem in data centers with energy
storage, and were the first to use the technique of Lyapunov
optimization for real-time storage control. The technique was
then employed in several subsequent works to design energy
storage control for various applications in grid operation,
such as power balancing [16], [17], demand side management
[18], [19], and EV charging [20]. Furthermore, the authors
of [21] analyzed the trade-off between averaging out the
energy fluctuation across time and across space, the authors
of [22] studied generalized storage control with general cost
functions, and the authors of [23] investigated the management
of networked storage with a DC power flow model. Among 
these works, single storage control was considered in [?],
[?], and [22]. For multiple storage control, charging efficiency
was incorporated into the storage model in [20], both charging
and discharging efficiencies were introduced in [?], and
storage efficiency that models the energy loss over time was
included in [?].

In this paper, we study the problem of phase balancing with 
energy storage in the presence of system uncertainty. Unlike 
prior works such as [24] and [25] that focus on heuristic
algorithms for storage control in phase balancing, in this
paper, we provide efficient algorithms with strong theoretical
performance guarantees. We consider a substation connected
to multiple phases, each with single-phase uncontrollable
flow, controllable flow, and energy storage. In particular,
we consider phase balancing on a time scale of seconds to
minutes. As such, we do not model power system physics such
as frequency and voltage magnitude. Aiming at minimizing 
the cost of all phases and mitigating phase imbalance, we 
propose a real-time algorithm that can be easily implemented 
by the substation. Moreover, for the likely scenario of limited 
communication between the substation and each phase, we 
provide a distributed implementation of the real-time algorithm 
where only limited information exchange is required. 

The main contributions of this paper are summarized as 
follows. First, we formulate a stochastic optimization problem 
for phase balancing incorporating system uncertainty, storage 
characteristics, and power network constraints. Second, for 
ideal energy storage with lossless charging and discharging, 
we provide a real-time algorithm building on the framework 
of Lyapunov optimization and prove its analytical performance 
guarantee. Moreover, we offer distributed implementation of 
the algorithm with fast convergence. Third, we extend the 
algorithm to accommodate non-ideal energy storage with 
imperfect charging and discharging efficiency and show its 
analytical performance. Finally, to numerically evaluate the 
performance of the proposed algorithm, we compare it with 
a benchmark greedy algorithm under various settings and 
parameters. Simulation reveals that our proposed algorithm is 
competitive in general. In particular, the proposed algorithm 
has strong performance when applied to storage with a large 
energy capacity, a high value of the energy-power ratio (e.g., 
compressed air energy storage and batteries), and moderate-to-high
charging and discharging efficiency (e.g., the round-trip
efficiency of storage is greater than 65%). In addition, a
practical outcome of our analysis shows the following design 
guideline: optimal power balancing favors even allocation of 
storage capacity over the phases. 

Our paper is technically most similar to [23], in which a
distributed real-time algorithm is proposed for power grids
with energy storage. However, these two papers are different
in terms of the application, objective, communication topology,
and power network constraints. Hence, the problem formulation
and the design of distributed implementation are largely
different. Moreover, for analysis, charging and discharging
efficiencies were not considered in the storage model in
[23]. While the authors stated that their framework could
further incorporate imperfect charging and discharging, no
implementable algorithm was given to address that. In contrast,
in this paper, we provide an efficient algorithm to deal with
imperfect charging and discharging.

A preliminary version of this work has been presented in
[26]. In this paper, we significantly extend [26] in two ways:
analytically, for the proposed algorithms, we provide more 
in-depth performance analysis for both ideal and non-ideal 
storage; numerically, we implement extensive simulation by 
examining various storage characteristics and the effect of 
correlation between the phases’ random power imbalances. 

The remainder of this paper is organized as follows. In
Section II, we describe the system model and formulate the
optimization problem. In Section III, we propose both centralized
and distributed real-time algorithms for ideal energy storage.
In Section IV, we extend the algorithm to accommodate non-ideal
energy storage with imperfect charging and discharging.
Numerical results are presented in Section V, and we conclude
in Section VI.

Fig. 1. System model with N phases. The details of the i-th phase are
shown.

II. System Model and Problem Statement 

Consider a discrete-time model with time $t \in \{0, 1, 2, \ldots\}$.
To simplify notation, we normalize the duration of each time
period $\Delta t$ to one and thus eliminate $\Delta t$ in presentation.
The system model is depicted in Fig. 1. A substation is
connected with $N > 2$ phases, each with single-phase loads
and generation.$^1$ We consider a general case where it is
optional for each phase to deploy energy storage. Denote the
set of phases that deploy storage by $\mathcal{E} \subseteq \{1, 2, \ldots, N\}$. Below
we first describe the components of each phase.

A. System Model of Each Phase 

At the i-th phase, denote the amount of uncontrollable
power at time slot t by $r_{i,t}$. The uncontrollable flow can
represent renewable generation such as wind and solar, base
loads, or the difference between renewable generation and base
loads. Since the uncontrollable flow is generally governed by
nature or uncertain human behavior, we assume that $r_{i,t}$ is
random, but it is confined within an interval $[r_{i,\min}, r_{i,\max}]$.
Throughout the paper we use a bold letter to denote a
vector that contains elements of N phases. Here, we define
$\mathbf{r}_t \triangleq [r_{1,t}, \ldots, r_{N,t}]$ to represent the uncontrollable flow vector
at time slot t. The other vectors in the rest of this paper are
defined similarly.

Denote the amount of the controllable power flow at the i-th
phase at time t by $l_{i,t}$. The controllable flow can represent the
output of conventional generators, or the consumption of flexible
loads. We associate a cost function with the controllable
flow and denote the function by $C_i(l_{i,t})$, which can represent
the cost of local generators (e.g., an on-site diesel generator),
or the cost of a utility for consuming power.

Denote the power flow between the substation and the i-th
phase at time slot t by $f_{i,t}$. Due to the capacity constraints of
power lines, the value of $f_{i,t}$ is generally confined. We assume
that at each time slot the power flow vector $\mathbf{f}_t \in \mathcal{F}$, where the
set $\mathcal{F}$ is non-empty, compact, and convex. For example, $\mathcal{F}$
may be defined as $\mathcal{F} = \{\mathbf{f}_t \mid f_{i,t} \in [f_{i,\min}, f_{i,\max}], \forall i\}$.

Remark: The values of $r_{i,t}$, $l_{i,t}$, and $f_{i,t}$ can be positive or
negative. We use the positive sign to indicate power injection
into the i-th phase, and the negative sign to indicate power
extraction from the i-th phase.

1 In practice, a typical substation consists of one or more three-phase feeders
connected through a feeder breaker, each of which can supply multiple
single-phase loads. In this paper, since we focus on the problem of phase balancing
in one feeder, the structure of feeders is omitted in Fig. 1. For a more thorough
description of a distribution substation, please see Chapter 1 in [27].

Assume that the i-th phase is equipped with an energy
storage unit, i.e., $i \in \mathcal{E}$. Denote the charging and discharging
rates of the storage at time slot t by $u^+_{i,t} \in [0, u_{i,\max}]$ and
$u^-_{i,t} \in [0, u_{i,\max}]$, respectively, where $u_{i,\max}$ is the maximum
charging and discharging rate. Denote the energy state of
the i-th storage at the beginning of time slot t by $s_{i,t}$,
which evolves as $s_{i,t+1} = s_{i,t} + u^+_{i,t} - u^-_{i,t}$. The energy
state $s_{i,t}$ is required to be within the storage's capacity limits
$[s_{i,\min}, s_{i,\max}]$.

Due to conversion and storage losses, charging and discharging
may not be perfectly efficient. For the i-th storage,
we denote the charging efficiency by $\eta_i^+ \in (0, 1]$ and the
discharging efficiency by $\eta_i^- \in (0, 1]$. Then, the associated
charging and discharging quantities seen on each phase are
$\frac{1}{\eta_i^+} u^+_{i,t}$ and $\eta_i^- u^-_{i,t}$, respectively (see Fig. 1). Owing to the
round-trip efficiency or other operating constraints, simultaneous
charging and discharging may be forbidden in practice,
which can be reflected by the constraint $u^+_{i,t} \cdot u^-_{i,t} = 0$, $i \in \mathcal{E}$.
Moreover, if the i-th phase is not equipped with storage, i.e.,
$i \notin \mathcal{E}$, we simply set the values of $s_{i,t}$, $u^+_{i,t}$, and $u^-_{i,t}$ to zero.
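The storage model above can be illustrated with a minimal Python sketch. The class name `Storage`, the function `step`, and all numeric values are hypothetical and chosen only for illustration; the sketch assumes the energy state evolves with the storage-side rates while the phase sees the efficiency-scaled quantities, as described in the text.

```python
from dataclasses import dataclass


@dataclass
class Storage:
    """Hypothetical container for the storage parameters of one phase."""
    s_min: float      # lower capacity limit
    s_max: float      # upper capacity limit
    u_max: float      # maximum charging and discharging rate
    eta_plus: float   # charging efficiency in (0, 1]
    eta_minus: float  # discharging efficiency in (0, 1]


def step(st, s, u_plus, u_minus):
    """Advance the energy state by one slot.

    Returns (next state, power drawn from the phase while charging,
    power delivered to the phase while discharging).
    """
    if not (0.0 <= u_plus <= st.u_max and 0.0 <= u_minus <= st.u_max):
        raise ValueError("rate limit violated")
    if u_plus * u_minus != 0.0:
        raise ValueError("simultaneous charging and discharging")
    s_next = s + u_plus - u_minus
    if not (st.s_min <= s_next <= st.s_max):
        raise ValueError("capacity limit violated")
    drawn = u_plus / st.eta_plus         # quantity seen on the phase when charging
    delivered = st.eta_minus * u_minus   # quantity seen on the phase when discharging
    return s_next, drawn, delivered


storage = Storage(s_min=0.0, s_max=10.0, u_max=2.0, eta_plus=0.9, eta_minus=0.8)
s1, drawn1, delivered1 = step(storage, 5.0, u_plus=1.8, u_minus=0.0)
s2, drawn2, delivered2 = step(storage, s1, u_plus=0.0, u_minus=1.0)
```

In this example, storing 1.8 units requires drawing about 2.0 units from the phase, and releasing 1.0 unit delivers only 0.8 units, which is how the two efficiencies enter the power balance.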

The energy storage can additionally be used for arbitrage.$^2$
Denote the electricity price at time slot t by $p_t \in [p_{\min}, p_{\max}]$,
which is random over time. Then the cost of the i-th phase
for energy arbitrage during time slot t is $p_t\left(\frac{1}{\eta_i^+} u^+_{i,t} - \eta_i^- u^-_{i,t}\right)$.
Finally, frequent charging and discharging can shorten the
lifetime of storage [28]. To model this effect, we introduce a
degradation cost function $D_i(\cdot)$, with negative input indicating
discharging and positive input indicating charging. Therefore,
the degradation cost incurred at time slot t is given by
$D_i(u^+_{i,t}) + D_i(-u^-_{i,t})$.$^3$


B. Problem Statement 

Since phase imbalance is harmful for power system operation,
it is critical to balance the power flows $f_{i,t}$ among phases.
To this end, we introduce a loss function $F(\cdot)$ to characterize
the deviation of $f_{i,t}$ from the average power flow. In particular,
for the i-th phase, $F(\cdot)$ is a function of $f_{i,t} - \bar{f}_t$, where $\bar{f}_t$ is
the average defined as $\bar{f}_t \triangleq \frac{1}{N} \sum_{j=1}^{N} f_{j,t}$.

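We assume that the system is operated by a representative
of the substation, who aims to minimize the long-term system
cost, which includes the costs of all phases. Specifically, based
on the model described in Section II-A, the system cost at time
slot t is given by

$$
w_t \triangleq \sum_{i \in \mathcal{E}} \left[ p_t \left( \frac{1}{\eta_i^+} u^+_{i,t} - \eta_i^- u^-_{i,t} \right) + D_i(u^+_{i,t}) + D_i(-u^-_{i,t}) \right]
+ \sum_{i=1}^{N} \left[ C_i(l_{i,t}) + F(f_{i,t} - \bar{f}_t) \right].
$$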
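As a concrete reading of the system cost, the following Python sketch evaluates $w_t$ for one time slot. It is an illustration under assumptions: the function name `system_cost`, the quadratic cost shapes, and the numbers are hypothetical and do not come from the paper.

```python
def quadratic(k):
    """Return the convex function x -> k * x^2 (an assumed cost shape)."""
    return lambda x: k * x * x


def system_cost(price, l, f, u_plus, u_minus, eta_plus, eta_minus,
                storage_phases, C, D, F):
    """Evaluate the one-slot system cost w_t.

    storage_phases lists the indices of phases that deploy storage;
    C and D are per-phase cost functions, F is the imbalance loss.
    """
    n = len(l)
    f_bar = sum(f) / n  # average flow over all phases
    cost = 0.0
    for i in storage_phases:
        # arbitrage cost on the energy exchanged with the phase
        cost += price * (u_plus[i] / eta_plus[i] - eta_minus[i] * u_minus[i])
        # degradation cost of charging and of discharging
        cost += D[i](u_plus[i]) + D[i](-u_minus[i])
    for i in range(n):
        # controllable-flow cost and phase-imbalance loss
        cost += C[i](l[i]) + F(f[i] - f_bar)
    return cost


w = system_cost(
    price=2.0,
    l=[1.0, 0.0],
    f=[1.0, -1.0],
    u_plus=[0.9, 0.0],
    u_minus=[0.0, 0.0],
    eta_plus=[0.9, 1.0],
    eta_minus=[0.8, 1.0],
    storage_phases=[0],
    C=[quadratic(1.0), quadratic(1.0)],
    D=[quadratic(0.1), quadratic(0.1)],
    F=quadratic(5.0),
)
```

With these illustrative numbers, the arbitrage term contributes 2.0, degradation 0.081, the controllable flow 1.0, and the imbalance loss 10.0.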


2 Energy storage is still expensive based on the current technology. Therefore,
in practice, besides phase balancing and energy arbitrage, energy storage
can be used to provide other grid-wide services, e.g., volt-var control, load
following, and peak shaving.

3 Accurate modeling of battery degradation is highly complicated and is an 
active research area. In this paper, to focus on storage control, we employ 
a simplified degradation model, which is a function of the charging and 
discharging amount. 


Denote the random system state at time slot t by $\mathbf{q}_t \triangleq [\mathbf{r}_t, p_t]$,
which includes the uncontrollable power flow of N phases and
the electricity price. Denote the control action at time slot t
by $\mathbf{a}_t \triangleq [\mathbf{l}_t, \mathbf{u}^+_t, \mathbf{u}^-_t, \mathbf{f}_t]$, which contains the controllable power
flow, the charging and discharging amounts, and the power
flow between each phase and the substation. We formulate
the problem for phase balancing as the following stochastic
optimization problem.

$$
\textbf{P1:} \quad \min_{\{\mathbf{a}_t\}} \; \limsup_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[w_t]
$$

subject to

\begin{align}
& 0 \le u^+_{i,t},\, u^-_{i,t} \le u_{i,\max}, && \forall i \in \mathcal{E},\, t, \tag{1} \\
& u^+_{i,t} \cdot u^-_{i,t} = 0, && \forall i \in \mathcal{E},\, t, \tag{2} \\
& s_{i,t+1} = s_{i,t} + u^+_{i,t} - u^-_{i,t}, && \forall i \in \mathcal{E},\, t, \tag{3} \\
& s_{i,\min} \le s_{i,t} \le s_{i,\max}, && \forall i \in \mathcal{E},\, t, \tag{4} \\
& u^-_{i,t} = u^+_{i,t} = 0, && \forall i \notin \mathcal{E},\, t, \tag{5} \\
& \mathbf{f}_t \in \mathcal{F}, && \forall t, \tag{6} \\
& f_{i,t} + r_{i,t} + l_{i,t} + \eta_i^- u^-_{i,t} - \frac{1}{\eta_i^+} u^+_{i,t} = 0, && \forall i,\, t. \tag{7}
\end{align}

The expectation in the objective is taken over the randomness
of $\mathbf{q}_t$ and the possibly random control action that depends on
$\mathbf{q}_t$. Constraint (7) enforces power balance at each phase at
each time slot.

To keep mathematical exposition simple, we assume that
the cost functions $C_i(\cdot)$ and $D_i(\cdot)$ are continuously differentiable
and convex. This assumption is realistic because many
practical cost functions are well approximated this way [29].
In particular, by convexity, the marginal cost is increasing.
With the objective of minimizing the system cost, for the
function $C_i(\cdot)$, this property discourages excessive use of the
controllable flow. For battery degradation, it is understood that
faster charging or discharging has a more detrimental effect
on the battery lifetime, and the convexity of $D_i(\cdot)$ reflects this
behavior. Denote the derivatives of $C_i(\cdot)$ and $D_i(\cdot)$ by $C_i'(\cdot)$
and $D_i'(\cdot)$, respectively. Since the variables $u^+_{i,t}$, $u^-_{i,t}$, and $l_{i,t}$
are bounded based on the constraints of P1, the cost functions
and their derivatives are bounded in the feasible set. For the
cost function $C_i(\cdot)$, we denote its range by $[C_{i,\min}, C_{i,\max}]$ and
the range of its derivative by $[C'_{i,\min}, C'_{i,\max}]$ in the feasible
set. The range of the cost function $D_i(\cdot)$ and that of its
derivative are defined similarly. In addition, we assume that the
loss function $F(\cdot)$ is convex and continuously differentiable.

We are interested in designing both centralized and distributed
real-time algorithms for solving P1. Distributed implementation
is motivated by the limited capability of real-time
communication between the substation and each phase,
and also the potential privacy concerns of each phase. This
is a challenging task due to system uncertainty, the coupling
of all phases through the objective and constraints, and the
energy state constraint (4), which couples the charging and
discharging actions over time. In addition, we assume that
the system is not equipped with any forecaster and only
has historical information of the system states. Designing
appropriate forecasters and incorporating forecast into optimal
control are important directions and are left for future work.
III. Real-Time Algorithm for Ideal Energy Storage

For tractability, in this section we first consider ideal energy
storage that has perfectly efficient charging and discharging,
i.e., $\eta_i^+ = \eta_i^- = 1$. The case of non-ideal energy storage
is studied in Section IV. We first propose a centralized real-time
algorithm that can be implemented by the substation and
show its analytical performance. Then we provide distributed
implementation for the proposed algorithm where only limited
information exchange is needed.

A. Centralized Real-Time Algorithm and Analysis 

Under perfectly efficient charging and discharging, without
loss of generality, we can combine the charging and discharging
variables $u^+_{i,t}$ and $u^-_{i,t}$ into one by introducing a new
variable $u_{i,t} \triangleq u^+_{i,t} - u^-_{i,t}$, which represents the net charging
and discharging amount. In particular, $u_{i,t} > 0$ indicates
charging, and $u_{i,t} < 0$ indicates discharging.

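With the new variable $u_{i,t}$, the non-simultaneous charging
and discharging constraint (2) can be eliminated, and the
evolution of the energy state amounts to $s_{i,t+1} = s_{i,t} + u_{i,t}$.
In addition, with $u_{i,t}$, the control action at time slot t is
now $\mathbf{a}_t \triangleq [\mathbf{l}_t, \mathbf{u}_t, \mathbf{f}_t]$, and the system cost can be rewritten as
$w_t = \sum_{i \in \mathcal{E}} \left[ p_t u_{i,t} + D_i(u_{i,t}) \right] + \sum_{i=1}^{N} \left[ C_i(l_{i,t}) + F(f_{i,t} - \bar{f}_t) \right]$.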

For the design of real-time implementation, we employ
Lyapunov optimization [14], which has been used widely in
wireless networks for dealing with time-averaged constraints
and providing simple yet efficient algorithms for complex
dynamic systems. However, the energy state constraint (4) is
not a time-averaged constraint but a hard constraint, and it
couples the control action $u_{i,t}$ over multiple time instances.
As a result, P1 is not amenable to the standard framework
of Lyapunov optimization. To overcome this difficulty, we
replace the energy state constraints (3) and (4) with a new
time-averaged constraint, which only requires the net charging
and discharging amount to be zero on average, i.e.,

$$
\lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[u_{i,t}] = 0, \quad \forall i \in \mathcal{E}. \tag{8}
$$

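With the new constraint (8), we form a new stochastic optimization
problem as follows:

$$
\textbf{P2:} \quad \min_{\{\mathbf{a}_t\}} \; \limsup_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[w_t]
$$

subject to (6), (8), and

\begin{align}
& f_{i,t} + r_{i,t} + l_{i,t} - u_{i,t} = 0, && \forall i,\, t, \tag{9} \\
& u_{i,t} = 0, && \forall i \notin \mathcal{E},\, t, \tag{10} \\
& -u_{i,\max} \le u_{i,t} \le u_{i,\max}, && \forall i \in \mathcal{E},\, t. \tag{11}
\end{align}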


It can be shown that constraints (3) and (4) imply (8), i.e., any
$u_{i,t}$ that satisfies (3) and (4) also satisfies (8). Hence, P2 is a
relaxation of P1 (see Appendix A).
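The implication can be sketched with a short telescoping argument. Appendix A is not reproduced in this section, so the steps below are a standard derivation from the state evolution and the capacity limits, offered as an illustration rather than as the authors' proof.

```latex
\begin{align*}
\sum_{t=0}^{T-1} u_{i,t}
  &= \sum_{t=0}^{T-1} \left( s_{i,t+1} - s_{i,t} \right) = s_{i,T} - s_{i,0}
  && \text{(telescoping sum, by (3))} \\
\left| \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[u_{i,t}] \right|
  &\le \frac{s_{i,\max} - s_{i,\min}}{T}
  \;\xrightarrow[T \to \infty]{}\; 0
  && \text{(both states lie in } [s_{i,\min}, s_{i,\max}] \text{ by (4))}
\end{align*}
```

Because the energy state is bounded, the cumulative net charge cannot grow with the horizon, so its time average vanishes. This is exactly the time-averaged constraint (8).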

The above relaxation step is crucial for applying Lyapunov
optimization. However, we emphasize that solving P2 is not
our purpose (it is clear that, due to the relaxation of constraints
(3) and (4), a solution to P2 may be infeasible to P1).
Instead, the significance of proposing P2 is to facilitate the
development of a real-time algorithm for P1 and the associated
performance analysis. Later we will prove in Proposition 1 that
our proposed algorithm ensures that constraints (3) and (4) are
satisfied, and therefore produces a feasible solution to P1.

We now propose a real-time algorithm leveraging Lyapunov
optimization techniques. At time slot t, for phase
$i \in \mathcal{E}$, define a Lyapunov function $L(s_{i,t}) \triangleq \frac{1}{2}(s_{i,t} - \beta_i)^2$,
which measures the deviation of the energy state $s_{i,t}$
from a perturbation parameter $\beta_i$. The parameter $\beta_i$ is
introduced to ensure the boundedness of the energy state,
i.e., constraint (4), and it needs to be carefully designed.
In addition, we define a one-slot conditional Lyapunov
drift as $\Delta(\mathbf{s}_t) \triangleq \mathbb{E}\left[ \sum_{i \in \mathcal{E}} \frac{L(s_{i,t+1}) - L(s_{i,t})}{V_i} \,\middle|\, \mathbf{s}_t \right]$, which collects
a weighted sum of the one-slot conditional drifts of the
Lyapunov functions for all phases with storage, with positive
weights $1/V_i$ (the parameter $V_i$ is specified in Proposition 1).

In our design of the real-time algorithm, instead of directly
minimizing the system cost $w_t$, we consider a drift-plus-cost
function $\Delta(\mathbf{s}_t) + \mathbb{E}[w_t \mid \mathbf{s}_t]$. In particular, we first derive an upper
bound on the drift-plus-cost function (see Appendix B for
the upper bound), and then formulate a per-slot optimization
problem to minimize this upper bound. Consequently, at each
time slot t, we solve the following optimization problem:

$$
\textbf{P3:} \quad \min_{\mathbf{a}_t} \; w_t + \sum_{i \in \mathcal{E}} \frac{(s_{i,t} - \beta_i)\, u_{i,t}}{V_i}
\quad \text{s.t.} \quad (6),\ (9)\text{-}(11).
$$


Denote an optimal solution of P3 at time slot t by
$\mathbf{a}_t^* \triangleq [\mathbf{l}_t^*, \mathbf{u}_t^*, \mathbf{f}_t^*]$. At each time slot, after obtaining the solution
$\mathbf{a}_t^*$, we update $s_{i,t}$ using $u^*_{i,t}$. It can be easily verified that the
optimization problem P3 is convex, and thus can be solved by
standard convex optimization software packages such as those
in MATLAB. We will later show in Theorem 1 that such
design of the per-slot optimization problem leads to
guaranteed performance.
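The link between the drift-plus-cost function and the objective of P3 can be sketched as follows. Appendix B is not part of this section, so this is a reconstruction from the definitions above, assuming the drift weights each phase by $1/V_i$ and the ideal-storage update $s_{i,t+1} = s_{i,t} + u_{i,t}$.

```latex
% One-slot change of the Lyapunov function for phase i:
\begin{align*}
L(s_{i,t+1}) - L(s_{i,t})
  &= \tfrac{1}{2}\left(s_{i,t} + u_{i,t} - \beta_i\right)^2
   - \tfrac{1}{2}\left(s_{i,t} - \beta_i\right)^2 \\
  &= (s_{i,t} - \beta_i)\, u_{i,t} + \tfrac{1}{2} u_{i,t}^2
  \;\le\; (s_{i,t} - \beta_i)\, u_{i,t} + \tfrac{1}{2} u_{i,\max}^2 .
\end{align*}
% Dividing by V_i > 0, summing over i in E, taking the conditional
% expectation, and adding E[w_t | s_t]:
\begin{align*}
\Delta(\mathbf{s}_t) + \mathbb{E}[w_t \mid \mathbf{s}_t]
  \;\le\; \sum_{i \in \mathcal{E}} \frac{u_{i,\max}^2}{2 V_i}
  + \mathbb{E}\left[\, w_t + \sum_{i \in \mathcal{E}}
      \frac{(s_{i,t} - \beta_i)\, u_{i,t}}{V_i} \,\middle|\, \mathbf{s}_t \right].
\end{align*}
```

The first term on the right is a constant, so minimizing the bound at each slot amounts to minimizing the expression inside the expectation, which is the objective of P3.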

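In the proposition below, we show that, despite the relaxation
to arrive at P2, by appropriately designing the perturbation
parameter $\beta_i$, we can ensure that constraint (4) is satisfied,
and therefore the control actions $\{\mathbf{a}_t^*\}$ are feasible to P1.

Proposition 1: For phase $i \in \mathcal{E}$, set the perturbation parameter
$\beta_i$ as

$$
\beta_i \triangleq s_{i,\min} + u_{i,\max} + V_i \left( p_{\max} + D'_{i,\max} + C'_{i,\max} \right), \tag{12}
$$

where $V_i \in (0, V_{i,\max}]$ with

$$
V_{i,\max} \triangleq \frac{s_{i,\max} - s_{i,\min} - 2u_{i,\max}}{p_{\max} - p_{\min} + D'_{i,\max} - D'_{i,\min} + C'_{i,\max} - C'_{i,\min}}. \tag{13}
$$

Then the control actions $\{\mathbf{a}_t^*\}$ obtained by solving P3 at each
time t are feasible to P1.

Proof: See Appendix C. ■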

To ensure the positivity of $V_{i,\max}$ in (13), we need the
numerator $s_{i,\max} - s_{i,\min} - 2u_{i,\max} > 0$. This is generally
true for real-time applications, in which the length of each
time interval is small, ranging from a few seconds to minutes.
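The parameter design in Proposition 1 can be sketched in a few lines of Python. The function name `design_parameters`, its argument names, and the numbers are hypothetical; the formulas assume the forms of (12) and (13) for ideal storage, with `dD_*` and `dC_*` standing for the bounds on the derivatives of $D_i$ and $C_i$.

```python
def design_parameters(s_min, s_max, u_max, p_min, p_max,
                      dD_min, dD_max, dC_min, dC_max, v_fraction=1.0):
    """Return (beta, V, V_max) for one phase with ideal storage.

    v_fraction in (0, 1] selects V = v_fraction * V_max.
    """
    if not 0.0 < v_fraction <= 1.0:
        raise ValueError("v_fraction must lie in (0, 1]")
    numerator = s_max - s_min - 2.0 * u_max
    if numerator <= 0.0:
        raise ValueError("capacity too small relative to the rate limit")
    denominator = (p_max - p_min) + (dD_max - dD_min) + (dC_max - dC_min)
    if denominator <= 0.0:
        raise ValueError("price and marginal-cost ranges must not all be zero")
    v_max = numerator / denominator
    v = v_fraction * v_max
    beta = s_min + u_max + v * (p_max + dD_max + dC_max)
    return beta, v, v_max


beta, v, v_max = design_parameters(
    s_min=0.0, s_max=10.0, u_max=1.0,
    p_min=1.0, p_max=3.0,
    dD_min=-0.2, dD_max=0.2,
    dC_min=-1.0, dC_max=1.0,
)
# Margin used at the upper end of the capacity range; it is zero when V = V_max.
upper_margin = (10.0 - 1.0 - beta) + v * (1.0 + (-0.2) + (-1.0))
```

Choosing `v_fraction` below one leaves a positive upper margin at the price of a larger performance gap in Theorem 1, since the gap scales with $1/V_i$.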

The overall centralized real-time algorithm is summarized
in Algorithm 1, which can be implemented by the substation.
It is worth mentioning that the proposed algorithm does not
require any system statistics, which may be desirable when
accurate system statistics are difficult to obtain.
Algorithm 1 Centralized algorithm for ideal storage.

At time slot t, the substation executes the following steps
sequentially:

1: observe the system state $\mathbf{q}_t$ and the energy state $s_{i,t}$;
2: solve P3 and obtain a solution $\mathbf{a}_t^* \triangleq [\mathbf{l}_t^*, \mathbf{u}_t^*, \mathbf{f}_t^*]$; and
3: update $s_{i,t+1}$ by $s_{i,t} + u^*_{i,t}$.
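The three steps of Algorithm 1 can be illustrated on a deliberately simplified toy problem. The sketch below is not the paper's full model: it assumes a single storage unit doing price arbitrage with a quadratic degradation cost $D(u) = d\,u^2$, no controllable-flow cost, and no imbalance loss, so that the per-slot problem has a closed-form solution. All names and numbers are hypothetical.

```python
import random


def simulate(num_slots=2000, seed=0):
    """Run the observe / solve / update loop on a toy arbitrage problem."""
    rng = random.Random(seed)
    s_min, s_max, u_max = 0.0, 10.0, 1.0
    p_min, p_max = 1.0, 3.0
    d = 0.1                       # D(u) = d * u^2
    dD_max = 2.0 * d * u_max      # largest marginal degradation cost
    dD_min = -dD_max              # smallest marginal degradation cost
    # V = V_max and beta as in Proposition 1, with the C' terms dropped
    # because the toy model has no controllable flow.
    v = (s_max - s_min - 2.0 * u_max) / (p_max - p_min + dD_max - dD_min)
    beta = s_min + u_max + v * (p_max + dD_max)

    s = s_min
    states = [s]
    for _ in range(num_slots):
        # Step 1: observe the price (random state) and the energy state.
        p = rng.uniform(p_min, p_max)
        # Step 2: minimise p*u + d*u^2 + (s - beta)*u/v over |u| <= u_max.
        # The objective is a convex parabola, so clipping the unconstrained
        # minimiser to the interval gives the constrained minimiser.
        u = -(p + (s - beta) / v) / (2.0 * d)
        u = max(-u_max, min(u_max, u))
        # Step 3: update the energy state.
        s = s + u
        states.append(s)
    return states, s_min, s_max


states, lower, upper = simulate()
```

Although the capacity limits are never imposed inside the loop, the trajectory stays within them, which is the behaviour Proposition 1 describes for the full model.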


Denote the optimal objective value of P1 by $w^{\mathrm{opt}}$. Under
Algorithm 1, denote the objective value of P1 by $w^*$ and
the system cost at time slot t by $w_t^*$. The performance of
Algorithm 1 is stated in the following theorem.

Theorem 1: Assume that the system state $\mathbf{q}_t$ is i.i.d. over
time and the equipped storage at the phases is perfectly
efficient. Under Algorithm 1, the following statements hold.

1) $w^* - w^{\mathrm{opt}} \le \sum_{i \in \mathcal{E}} \frac{u_{i,\max}^2}{2V_i}$.

2) $\frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[w_t^*] - w^{\mathrm{opt}} \le \sum_{i \in \mathcal{E}} \frac{u_{i,\max}^2}{2V_i} + \sum_{i \in \mathcal{E}} \frac{\mathbb{E}[L(s_{i,0})]}{V_i T}$.

Proof: See Appendix D. ■

Remarks:

• For Theorem 1.1, first, if $\mathcal{E}$ is empty, i.e., no phase
deploys storage, then Algorithm 1 achieves the optimal
objective value. In fact, for this case, Algorithm 1 reduces
to a greedy algorithm that only minimizes the current
system cost at each time. Second, if $\mathcal{E}$ is non-empty,
to minimize the gap to the optimal objective value, we
should set $V_i = V_{i,\max}$. Asymptotically, if the energy
capacity $s_{i,\max}$ is large and thus $V_{i,\max}$ is large, Algorithm
1 achieves the optimal objective value.$^4$

• In Theorem 1.2, we characterize the performance of
Algorithm 1 over a finite time horizon. The result not only
shows the performance gap of the algorithm over a finite
time T, but also reveals how the gap converges asymptotically
to the one in Theorem 1.1 as T grows. It can be seen
that the gap contains a component $\sum_{i \in \mathcal{E}} \frac{\mathbb{E}[L(s_{i,0})]}{V_i T}$ due
to the initialization of the energy states, which decreases
in inverse proportion to the time horizon T.

• The i.i.d. assumption of the system state $\mathbf{q}_t$ can be
relaxed to accommodate $\mathbf{q}_t$ that follows a finite state
irreducible and aperiodic Markov chain. Using a multi-slot
drift technique [?], we can show similar conclusions
which are omitted here. In simulation, we will evaluate
the algorithm performance when the uncontrollable power
flows are temporally correlated.

An interesting additional consequence of Theorem 1 is
that we obtain a general rule of thumb for the allocation of 
energy storage capacity among the phases. In particular, in the 
following proposition, we demonstrate that, under some mild 
conditions, equal allocation of a given energy storage capacity 
results in a lower overall system cost. 

Proposition 2: Assume that $u_{i,\max}$ and $D'_{i,\max} - D'_{i,\min}$
are identical for all $i \in \mathcal{E}$. Assume further that for all
phases, $C'_{i,\max} - C'_{i,\min}$ is the same. Then, under Algorithm
1, if the total energy storage capacity $\sum_{i \in \mathcal{E}} s_{i,\max}$ is fixed
and the control parameter $V_i = V_{i,\max}$ is as in (13), the
upper bound of the performance gap in Theorem 1.1, i.e.,
$\sum_{i \in \mathcal{E}} \frac{u_{i,\max}^2}{2V_{i,\max}}$, is minimized when the energy storage capacity
is equally allocated among phases.

Proof: See Appendix E. ■

4 The choice of $V_i = V_{i,\max}$ and the asymptotic optimality are based on the
linear storage model in (3). These conclusions need to be re-examined when
a more general storage model with other factors such as storage efficiency is
considered [?], [23].

The above result states that energy storage is best allocated 
equally over the phases. Note that this result is robust because 
it does not depend on any system statistics or specific values 
of system parameters. We will revisit this in simulation. 
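A small numerical check illustrates the equal-allocation guideline. The sketch assumes $s_{i,\min} = 0$ for every phase, so that the capacity of phase i equals $s_{i,\max}$, and uses a common value `slope_range` for the denominator of $V_{i,\max}$; the function name `gap_bound` and the numbers are hypothetical.

```python
def gap_bound(capacities, u_max, slope_range):
    """Upper bound on the gap: sum over phases of u_max^2 / (2 * V_max_i).

    V_max_i = (capacity_i - 2 * u_max) / slope_range, where slope_range is
    the common sum of the price and marginal-cost ranges.
    """
    total = 0.0
    for capacity in capacities:
        v_max = (capacity - 2.0 * u_max) / slope_range
        if v_max <= 0.0:
            raise ValueError("capacity too small relative to the rate limit")
        total += u_max ** 2 / (2.0 * v_max)
    return total


# Three allocations of the same total capacity (30 units) over three phases.
equal = gap_bound([10.0, 10.0, 10.0], u_max=1.0, slope_range=4.4)
skewed = gap_bound([16.0, 8.0, 6.0], u_max=1.0, slope_range=4.4)
very_skewed = gap_bound([24.0, 3.0, 3.0], u_max=1.0, slope_range=4.4)
```

Each term is convex and decreasing in the capacity of its phase, so moving capacity from a small unit to a large one raises the small unit's term by more than it lowers the large unit's term. This is why the equal split gives the smallest bound in the example.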

In this paper, as our focus is on designing real-time algorithms
for storage control, we use a stylized system model
shown in Fig. 1. In particular, we do not model the network
structure at each phase, and therefore do not consider how to
place storage. Instead, we assume a given arbitrary deployment
of storage at any location of each phase. Nonetheless, we point
out that storage placement affects the investment strategy of
power systems and is crucial for grid operation. This problem
has attracted considerable attention and has been investigated
in many papers (e.g., [30]). In particular, in the case of physical
storage, the authors of [30] proved that, under some technical
conditions, there always exists an optimal strategy of storage 
placement that assigns zero storage at generation-only buses 
that link to the rest of the network via single transmission 
lines. Moreover, if storage is load aggregation, then it can 
be distributed over the phase. How to extend our results to 
consider storage placement is a topic for future study. 


B. Distributed Implementation of Centralized Algorithm 

To accomplish the implementation of Algorithm 1 in a
centralized way, each phase has to provide all information that 
is required to solve the real-time problem P3. Specifically, for 
each phase, the cost functions and the associated optimization 
constraints need to be communicated to the substation in 
advance. In addition, at each time slot, the information of 
the uncontrollable power flow as well as the storage energy 
state has to be sent to the substation. However, in practice, 
due to the limited capability of real-time communication 
along with potential privacy concerns of each phase, some 
of the aforementioned information may be unavailable at 
the substation. Therefore, the centralized implementation may 
be infeasible. In this subsection, we provide a distributed 
algorithm for solving P3 in which only limited information 
exchange is required. For ease of notation, we suppress the 
time index t in the following presentation. 

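The distributed algorithm is based on the alternating direction
method of multipliers (ADMM) [31]. To facilitate
algorithm development, we rewrite P3 as follows:

$$
\min_{\mathbf{l}, \mathbf{u}, \mathbf{f}} \; \mathbb{1}(\mathbf{f} \in \mathcal{F}) + \sum_{i=1}^{N} \left[ F(f_i - \bar{f}) + H_i(l_i, u_i) \right]
\quad \text{s.t.} \quad f_i + r_i + l_i - u_i = 0, \ \forall i, \tag{14}
$$

where $\mathbb{1}(\cdot)$ is the indicator function that equals 0 (resp. $+\infty$)
when the enclosed event is true (resp. false), and for each
phase the function $H_i(l_i, u_i)$ is defined as follows:

$$
H_i(l_i, u_i) \triangleq
\begin{cases}
p\, u_i + \dfrac{(s_i - \beta_i)\, u_i}{V_i} + D_i(u_i) + C_i(l_i) + \mathbb{1}(-u_{i,\max} \le u_i \le u_{i,\max}), & \text{if } i \in \mathcal{E}, \\[2mm]
C_i(l_i) + \mathbb{1}(u_i = 0), & \text{if } i \notin \mathcal{E}.
\end{cases}
$$

Fig. 2. Distributed implementation for solving P3. The substation updates
$\mathbf{f}^{k+1}$; phase i updates $l_i^{k+1}$, $u_i^{k+1}$, and $\lambda_i^{k+1}$.

Algorithm 2 Centralized algorithm for non-ideal storage.

At time slot t, the substation executes the following steps
sequentially:

1: observe the system state $\mathbf{q}_t$ and the energy state $s_{i,t}$;
2: solve P3' and obtain an intermediate solution
$\mathbf{a}'_t \triangleq [\mathbf{l}'_t, \mathbf{u}'^{+}_t, \mathbf{u}'^{-}_t, \mathbf{f}'_t]$;
3: generate the final solution $\mathbf{a}^*_t$, where
$u^{+*}_{i,t} = \max\{u'^{+}_{i,t} - u'^{-}_{i,t}, 0\}$,
$u^{-*}_{i,t} = \max\{u'^{-}_{i,t} - u'^{+}_{i,t}, 0\}$,
$l^*_{i,t} = l'_{i,t} + \eta_i^- u'^{-}_{i,t} - \frac{1}{\eta_i^+} u'^{+}_{i,t} - \eta_i^- u^{-*}_{i,t} + \frac{1}{\eta_i^+} u^{+*}_{i,t}$,
and $\mathbf{f}^*_t = \mathbf{f}'_t$; and
4: update $s_{i,t}$ by (3) using $u^{+*}_{i,t}$ and $u^{-*}_{i,t}$.

We associate a Lagrange multiplier $\lambda_i$ with equality (14).
By treating the variables $(\mathbf{l}, \mathbf{u})$ as one block and the variable
$\mathbf{f}$ as the other, we express the updates at the (k+1)-th iteration
below according to the ADMM algorithm.

\begin{align*}
(l_i, u_i)^{k+1} &\leftarrow \arg\min_{l_i, u_i} \left[ H_i(l_i, u_i) + \frac{\rho}{2} \left( f_i^k + r_i + l_i - u_i + \frac{\lambda_i^k}{\rho} \right)^2 \right], \\
\mathbf{f}^{k+1} &\leftarrow \arg\min_{\mathbf{f} \in \mathcal{F}} \sum_{i=1}^{N} \left[ F(f_i - \bar{f}) + \frac{\rho}{2} \left( f_i + r_i + l_i^{k+1} - u_i^{k+1} + \frac{\lambda_i^k}{\rho} \right)^2 \right], \\
\lambda_i^{k+1} &\leftarrow \lambda_i^k + \rho \left( f_i^{k+1} + r_i + l_i^{k+1} - u_i^{k+1} \right).
\end{align*}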

and it i t cannot be combined into one as we did in Section 
uni and therefore, the (non-convex) non-simultaneous charg¬ 
ing and discharging constraint © cannot be eliminated. To 
overcome this difficulty, we first ignore constraint © and then 
adjust the resultant solution to satisfy the constraint. 

Specifically, we first modify the per-slot optimization prob¬ 
lem P3 to the following: 


P3’: 


min 

at 


s.t. 


( Si >t ~ A) 

W ‘ + T.—— 

i££ 


u i, t ) 


where we have defined the perturbation parameter 


where p > 0 is a pre-determined parameter. 

To implement the above iteration, each phase updates the
controllable power flow $l_i$, the net charging and discharging
amount $u_i$, and the Lagrange multiplier $\lambda_i$, while the
substation updates the power flow vector f. For information
exchange, at the (k+1)-th iteration, the substation sends $f_i^k$ to
each phase, and each phase provides
$m_i^k = r_i + l_i^{k+1} - u_i^{k+1} + \lambda_i^k/\rho$
to the substation. The schematic representation of the
distributed implementation is given in Fig. 2.
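As a rough illustration of the iteration above, the following Python sketch runs the two-block ADMM updates for a single time slot. It is not the authors' implementation. It assumes quadratic costs $C_i(x) = c x^2$, $D_i(x) = d x^2$, and $F(x) = a x^2$ (as in the default simulation setup), storage at every phase, box bounds on $l_i$ and $u_i$, and that $\bar{f}$ is the average of the $f_i$. The function name `admm_per_slot`, the argument `theta` (standing for $(s_i - \beta_i)/V_i$), and the inner coordinate-descent solver for the phase update are choices made only for this sketch.

```python
import numpy as np

def admm_per_slot(r, theta, p, c=1.5, d=0.2, a=10.0, rho=5.0,
                  u_max=1.0, l_min=-5.0, l_max=5.0, iters=500):
    """Two-block ADMM sketch for one time slot (quadratic costs assumed).

    r     : uncontrollable flows r_i, one entry per phase
    theta : (s_i - beta_i) / V_i per phase (hypothetical argument name)
    p     : electricity price in the current slot
    """
    r = np.asarray(r, dtype=float)
    theta = np.asarray(theta, dtype=float)
    n = r.size
    l, u, f, lam = (np.zeros(n) for _ in range(4))
    for _ in range(iters):
        # Phase side: minimize H_i + (rho/2)(f_i + r_i + l_i - u_i + lam_i/rho)^2
        # over the box, by exact coordinate minimization with clipping.
        b = f + r + lam / rho
        for _ in range(100):
            l = np.clip(-rho * (b - u) / (2 * c + rho), l_min, l_max)
            u = np.clip((rho * (b + l) - theta - p) / (2 * d + rho),
                        -u_max, u_max)
        # Substation side: closed-form minimizer of
        # sum_i [ a (f_i - mean(f))^2 + (rho/2)(f_i + m_i)^2 ].
        m = r + l - u + lam / rho
        f = -(2 * a * m.mean() + rho * m) / (2 * a + rho)
        # Multiplier update on the power balance residual.
        lam = lam + rho * (f + r + l - u)
    return l, u, f, lam

# Example call with three phases (illustrative numbers only).
r_example = np.array([2.0, -1.0, 0.5])
l, u, f, lam = admm_per_slot(r_example, theta=[0.1, -0.2, 0.0], p=0.1)
residual = np.max(np.abs(f + r_example + l - u))  # power balance violation
```

In this sketch each phase only needs $f_i^k$ from the substation, and the substation only needs the scalar $m_i^k$ from each phase, which mirrors the information exchange described above.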

Remark: Although the communication network structure
is the same in both centralized and distributed algorithms
(i.e., star topology), with the proposed distributed algorithm,
each phase only needs to provide the update of $m_i^k$ to the
substation without revealing the cost functions or other parameters.
Therefore, the communication load and the information
revealed by each phase are limited.

The convergence behavior of the distributed algorithm is
summarized in the following theorem. The proof follows
Theorem 2 in [32] and thus is omitted.

Theorem 2: Assume that the functions $D_i(\cdot)$, $C_i(\cdot)$, and
$F(\cdot)$ are closed, proper, and convex. The sequence
$\{l^k, u^k, f^k, \lambda^k\}$ converges to an optimal primal-dual solution
of P3 with the worst case convergence rate $O(1/k)$.

IV. Extension to Non-ideal Energy Storage 
In this section, we discuss the algorithm design for non-ideal
energy storage with inefficient charging and discharging.
This is significant because common storage technologies such
as batteries can have round-trip efficiency, i.e., $\eta_i^+ \eta_i^-$, ranging

from 70% to 95% ©. 

The mathematical framework of the algorithm design follows
that of ideal storage. However, due to imperfect charging
and discharging, the charging and discharging variables $u^+_{i,t}$


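$$\beta_i = s_{i,\min} + u_{i,\max} + V_i \Big( \frac{p_{\max}}{\eta_i^+} + D'_{i,\max} + \frac{1}{\eta_i^+} C'_{i,\max} \Big). \qquad (15)$$

The parameter $V_i$ in (15) lies in the interval $(0, V_{i,\max}]$, where

$$V_{i,\max} = \frac{s_{i,\max} - s_{i,\min} - 2u_{i,\max}}{\frac{p_{\max}}{\eta_i^+} - p_{\min}\eta_i^- + D'_{i,\max} - D'_{i,\min} + \frac{1}{\eta_i^+} C'_{i,\max} - \eta_i^- C'_{i,\min}}.$$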

Note that the definition of $\beta_i$ in (15) is similar to that
for ideal storage, except for the inclusion of the charging
and discharging efficiencies. Moreover, if $\eta_i^+ = \eta_i^- = 1$, (15)
reduces to the definition for ideal storage.

The overall centralized algorithm is summarized in Algorithm 2,
where we use the superscript notations ' and * to
indicate the intermediate solution derived from P3' and the
final solution, respectively. To ensure that the final solution
satisfies the non-simultaneous charging and discharging
constraint, in Step 3 we adjust the intermediate
charging and discharging solutions $u^{+\prime}_{i,t}$ and $u^{-\prime}_{i,t}$, and the
controllable power flow $l'_{i,t}$, so that simultaneous charging and
discharging cannot happen and the power balance constraint
still holds.
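The adjustment in Step 3 can be sketched for a single phase as follows. This is only an illustration: it assumes the power balance takes the form $f_i + r_i + l_i = u_i^+/\eta_i^+ - \eta_i^- u_i^-$ (inferred from the price term used in the proof of Lemma 7), and the function name `adjust_step3` is hypothetical.

```python
def adjust_step3(u_plus, u_minus, l, eta_plus, eta_minus):
    """Remove simultaneous charging and discharging for one phase.

    The net stored energy u_plus - u_minus is preserved, and the
    controllable flow l absorbs the change in grid-side storage power,
    so the power balance holds with f and r unchanged (assumed model).
    """
    net = u_plus - u_minus                      # energy added to the storage
    u_plus_new = max(net, 0.0)
    u_minus_new = max(-net, 0.0)
    # Grid-side storage power before and after the adjustment.
    before = u_plus / eta_plus - eta_minus * u_minus
    after = u_plus_new / eta_plus - eta_minus * u_minus_new
    l_new = l + (after - before)
    return u_plus_new, u_minus_new, l_new

# Example: an intermediate solution that charges and discharges at once.
u_p, u_m, l_adj = adjust_step3(0.8, 0.3, 1.0, eta_plus=0.9, eta_minus=0.9)
```

Because only one of the two adjusted storage variables is nonzero and their difference is unchanged, the energy state evolves exactly as it would under the intermediate solution.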

Remarks: Under some conditions, the non-simultaneous charging
and discharging constraint may hold automatically
when solving P3', e.g., when the electricity
price $p_t$ is positive and the cost function of the controllable
flow $C_i(\cdot)$ is increasing. However, if $p_t$ can be negative or
consuming controllable flow costs money, the solution of P3'
may not meet this constraint, and thus Step 3 in Algorithm 2
may be necessary. In addition, if simultaneous charging and
discharging is allowed in practice, we can simply eliminate
Step 3 in Algorithm 2.

The performance of Algorithm 2 is summarized in the
following theorem.

Theorem 3: Assume that the system state $q_t$ is i.i.d. over
time and the equipped storage at the phases is not perfectly
efficient. Under Algorithm 2 the following statements hold.

1) $\{a^*_t\}$ is feasible for P1.

2) $w^* - w^{\mathrm{opt}} \le \sum_{i \in \mathcal{E}} \frac{u_{i,\max}^2}{2V_i} + \epsilon$.

3) $\frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[w^*_t] - w^{\mathrm{opt}} \le \sum_{i \in \mathcal{E}} \Big[ \frac{u_{i,\max}^2}{2V_i} + \frac{\mathbb{E}[L(s_{i,0})]}{T V_i} \Big] + \epsilon$,

where $\epsilon = \sum_{i \in \mathcal{E}} \big[ p_{\max} u_{i,\max} \big( \frac{1}{\eta_i^+} + \eta_i^- \big) + 2D_{i,\max} + C_{i,\max} \big]$.

Proof: See Appendix F. ■

The results in Theorem 3 parallel those in Theorem 1
for ideal storage, with an extra gap $\epsilon$ incurred due to the
adjustment of the intermediate solutions. Furthermore, since
the non-simultaneous charging and discharging constraint is
ignored in P3', the problem is convex, and therefore
Algorithm 2 can be implemented distributively using
an ADMM-based algorithm similar to that in Section III-B.


V. Numerical Results 

In this section, we numerically evaluate the performance
of the proposed algorithm. In each example, all phases are
equipped with energy storage. The specific values for the
system parameters and functions are shown in Table I. The
other default setup is as follows: the system state $[r_t, p_t]$ is
i.i.d. over time; at each time slot, the uncontrollable power
flows are modeled as independent among phases, and they
follow the Gaussian distribution $\mathcal{N}(0, 4^2)$ truncated within
$[r_{i,\min}, r_{i,\max}]$; and the electricity price $p_t$ is approximated
to follow the uniform distribution. For ideal storage, at each
time slot, the control action $a_t = [l_t, u_t, f_t]$ is generated by
Algorithm 1, and for non-ideal storage, $a_t = [l_t, u_t^+, u_t^-, f_t]$
is generated by Algorithm 2. Both algorithms are run for
T = 500 time slots. The control parameter $V_i$ is set to $V_{i,\max}$.
Note that the only difference between the centralized and
the distributed algorithms is whether the per-slot optimization
problem in Algorithm 1 (or Algorithm 2) is solved centrally
by the substation or in distributed fashion by the substation
and all phases. Therefore, both algorithms lead to the same
solution to the per-slot optimization problem and thus the same
time-averaged system cost.^5
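The default random inputs described above can be generated along the following lines. This is a minimal sketch, not the authors' simulation code: it assumes rejection sampling for the truncated Gaussian and a price that is uniform on $[p_{\min}, p_{\max}]$, with the default values taken from Table I. The function names are hypothetical.

```python
import numpy as np

def sample_truncated_normal(rng, sigma, low, high, size):
    """Draw `size` samples from N(0, sigma^2) restricted to [low, high]
    by simple rejection sampling."""
    out = np.empty(size)
    filled = 0
    while filled < size:
        draw = rng.normal(0.0, sigma, size=size)
        keep = draw[(draw >= low) & (draw <= high)]
        take = min(keep.size, size - filled)
        out[filled:filled + take] = keep[:take]
        filled += take
    return out

def sample_system_state(rng, n_phases=3, sigma=4.0, r_min=-8.0, r_max=8.0,
                        p_min=7.0, p_max=12.0):
    """One time slot of uncontrollable flows (kW) and price (cents/kWh)."""
    r = sample_truncated_normal(rng, sigma, r_min, r_max, n_phases)
    p = rng.uniform(p_min, p_max)  # assumed uniform on [p_min, p_max]
    return r, p

rng = np.random.default_rng(0)
states = [sample_system_state(rng) for _ in range(500)]  # T = 500 slots
```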

For comparison, we use a greedy algorithm as the benchmark,
which does not account for future performance. In
particular, at each time slot, the greedy algorithm minimizes
the current system cost subject to all constraints of P1.
For ideal storage, the greedy algorithm solves the following
optimization problem in each time slot:

$$\min_{l_t, u_t, f_t} \; w_t$$

s.t. the remaining per-slot constraints of P1, together with

$$u_{i,t} \ge \max\{-u_{i,\max},\; s_{i,\min} - s_{i,t}\},$$

$$u_{i,t} \le \min\{u_{i,\max},\; s_{i,\max} - s_{i,t}\}.$$

For non-ideal storage, at time slot t, an intermediate solution
is first found by solving the optimization problem

$$\min_{l_t, u_t^+, u_t^-, f_t} \; w_t$$

subject to the per-slot constraints of P1 but without the
non-simultaneous charging and discharging constraint.
Then, the final solution of the greedy algorithm is
determined by adjusting the intermediate solution using Step
3 in Algorithm 2.

5 Since our focus in this paper is the design of real-time algorithms for 
storage control, we use a stylized system model. Extending our algorithm to 
accommodate more details of a power system and implementing the algorithm 
in a real network using full transient simulation are topics of future work. 


TABLE I
Default setup of parameters and functions

Par.                                 Setup                 Par. (Fun.)               Setup
$[r_{i,\min}, r_{i,\max}]$           [-8, 8] (kW)          $\eta_i^+ = \eta_i^-$     1
$[l_{i,\min}, l_{i,\max}]$           [-5, 5] (kW)          $C_i(x)$                  $1.5x^2$
$[s_{i,\min}, s_{i,\max}]$           [2, 10] (kWh)         $D_i(x)$                  $0.2x^2$
$[p_{\min}, p_{\max}]$               [7, 12] (cents/kWh)   $F(x)$                    $10x^2$
$u_{i,\max}$                         1 (kW)                $N$                       3



Fig. 3. System cost versus energy capacity of storage ($s_{1,\max} = s_{2,\max} = s_{3,\max}$).

Fig. 4. System cost versus energy capacity of storage at Phase 1 ($s_{1,\max} + s_{2,\max} + s_{3,\max} = 30$ kWh).


A. Effect of Energy Capacity of Storage 

In this subsection, we consider the effect of energy capacity
allocation on the system cost. In Fig. 3, we increase the
energy capacity of all storage units from 8
kWh to 50 kWh. Note that for the proposed algorithm the
role of $s_{i,\max}$ is played through the design of the control
parameter $V_{i,\max}$ in (13), and for the greedy algorithm the
effect of $s_{i,\max}$ is reflected through the upper bound of the
net charging and discharging variable $u_{i,t}$ in the optimization
problem. We see that, as $s_{i,\max}$ increases, the system cost
of the greedy algorithm does not change, while that of the
proposed algorithm drops with a decreasing slope. The former
phenomenon could happen when the maximum charging and
discharging rate $u_{i,\max}$ is relatively small and thus $s_{i,\max}$
has limited effect on $u_{i,t}$. The latter observation is consistent
with the second remark below Theorem 1 that the proposed
algorithm is asymptotically optimal when $s_{i,\max}$ is large. In
addition, from Theorem 1.1 we can obtain a lower bound of
the minimum system cost as
$w^{\mathrm{opt}} \ge w^* - \sum_{i \in \mathcal{E}} \frac{u_{i,\max}^2}{2V_i}$.
In Fig. 3 we also show the curve of this lower bound. In particular,
when the energy capacity is large, this lower bound is tight.
However, when the energy capacity is small-to-moderate, this
lower bound is loose. For the remaining part of the simulation
section, we study the performance of the proposed algorithm
when the energy capacity is moderate (e.g., $s_{i,\max} = 10$
kWh). Therefore, we have omitted this lower bound in the
remaining figures. Instead, the benchmark greedy algorithm is
more effective in evaluating the numerical performance of the
proposed algorithm. Moreover, like the proposed algorithm,
the performance of the greedy algorithm serves as an upper
bound of the minimum system cost.

Fig. 5. System cost versus phase correlation coefficient of uncontrollable
power flows of Phase 1 and Phase 2.

In Fig. 4, we fix the total energy capacity of all storage
units to 30 kWh (i.e., $s_{1,\max} + s_{2,\max} + s_{3,\max} = 30$ kWh) and
vary the capacity allocation among phases. In particular, we fix
$s_{2,\max}$ at 10 kWh and change $s_{1,\max}$ from 5 kWh to 15 kWh.
Two cases are considered: Case 1, the standard deviation of
the uncontrollable flow of each phase is 4 kW (default setup);
Case 2, the standard deviations of the uncontrollable flow of
phases 1, 2, and 3 are 3 kW, 4 kW, and 5 kW, respectively.
For both algorithms, Case 2 leads to a smaller system cost in
general. Moreover, for the greedy algorithm, the system cost
barely changes with $s_{1,\max}$. In comparison, for the proposed
algorithm, the system cost achieves the lowest value when
the energy capacity is approximately equally allocated. This
observation is consistent with our conclusion in Proposition 2.

B. Effect of Correlations of Uncontrollable Power Flows 

In this subsection, we examine the effect of both the phase
and time correlation of the uncontrollable power flows on the
system cost. In Fig. 5, we assume that at each time slot,
the uncontrollable flows of Phases 1 and 2 are correlated
with the phase correlation coefficient, denoted by $\rho_1$, while
the uncontrollable flow of Phase 3 is independent of those
of Phases 1 and 2. We see that, for both algorithms, the
system cost decreases with $\rho_1$. This is easy to understand,
since with a larger $\rho_1$ the uncontrollable flows of Phases 1 and
2 are more positively related, which makes phase balancing
less challenging. In Fig. 6, we additionally assume that the
uncontrollable flow of Phase 3 is correlated with that of Phase
1 with the same correlation coefficient $\rho_1$. With the additional
correlation among phases, the performance gap between the
proposed algorithm and the greedy algorithm becomes smaller.

Fig. 6. System cost versus phase correlation coefficient of uncontrollable
power flows of Phase 1 and Phases 2, 3.

Fig. 7. System cost versus time correlation coefficient of uncontrollable
power flow at each phase.
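For the setting of Fig. 5, phase-correlated flows can be produced as in the sketch below. It is an assumption-laden illustration rather than the authors' procedure: the flows are untruncated zero-mean Gaussians with standard deviation 4 kW, Phase 2 is built as a linear mix of Phase 1 and independent noise so that their correlation coefficient equals $\rho_1$, and Phase 3 is independent. The function name `correlated_flows` is hypothetical.

```python
import numpy as np

def correlated_flows(rng, rho1, sigma=4.0, n_slots=500):
    """Return an (n_slots, 3) array of flows where Phases 1 and 2 have
    correlation coefficient rho1 and Phase 3 is independent of both."""
    z = rng.standard_normal((n_slots, 3))
    r1 = z[:, 0]
    # Mixing with weight rho1 keeps unit variance and sets corr(r1, r2) = rho1.
    r2 = rho1 * z[:, 0] + np.sqrt(1.0 - rho1 ** 2) * z[:, 1]
    r3 = z[:, 2]
    return sigma * np.column_stack([r1, r2, r3])

rng = np.random.default_rng(1)
flows = correlated_flows(rng, rho1=0.5)
```

Truncating the samples to $[r_{i,\min}, r_{i,\max}]$ afterwards would change the realized correlation slightly, which this sketch ignores.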

In Fig. 7, we assume that the uncontrollable flows are
independent among phases at each time slot, but they are
temporally correlated with the time correlation coefficient,
denoted by $\rho_2$. We observe that, for both algorithms, the
system cost increases with $\rho_2$. This is because at each phase,
when the uncontrollable flow is more positively correlated, the
more expensive controllable flow is used for phase balancing
since the energy state of the storage is close to its range
limit. Consequently, the proposed algorithm achieves a lower
system cost when the uncontrollable flow is more negatively
correlated.
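One simple way to obtain temporally correlated flows with a prescribed lag-one coefficient $\rho_2$ is a first-order autoregressive process, sketched below. The source does not state which process was used, so the AR(1) model, the absence of truncation, and the function name `ar1_flows` are all assumptions of this sketch.

```python
import numpy as np

def ar1_flows(rng, rho2, sigma=4.0, n_slots=500, n_phases=3):
    """Stationary AR(1) flows per phase: lag-one correlation rho2,
    marginal standard deviation sigma, phases mutually independent."""
    r = np.empty((n_slots, n_phases))
    r[0] = sigma * rng.standard_normal(n_phases)
    # Innovation scale chosen so the marginal variance stays sigma^2.
    scale = sigma * np.sqrt(1.0 - rho2 ** 2)
    for t in range(1, n_slots):
        r[t] = rho2 * r[t - 1] + scale * rng.standard_normal(n_phases)
    return r

rng = np.random.default_rng(2)
flows = ar1_flows(rng, rho2=0.6)
```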

C. Effect of Charging and Discharging Circuit Parameters 

In Fig. 8, we consider that each phase is equipped with non-ideal
energy storage. The charging and discharging efficiencies
$\eta_i^+$ and $\eta_i^-$ of each storage are assumed to be the same. We
see that for both algorithms, the system cost decreases almost
linearly with the round-trip efficiency. The decreasing trend is
expected since the storage becomes more efficient with a larger
value of the round-trip efficiency. In particular, the proposed
algorithm leads to a lower system cost when the storage is
reasonably efficient. From the figure, this corresponds to the
case when the round-trip efficiency is greater than 0.65, which
includes the range of the round-trip efficiency for most energy
storage in practice [6]. On the other hand, when the storage is
highly inefficient, the greedy algorithm is shown to produce a
better performance.

Fig. 8. System cost versus round-trip efficiency of storage at each phase.

Fig. 9. System cost versus maximum charging/discharging rate of storage
at each phase.

In Fig. 9, we vary the value of the maximum charging
and discharging rate $u_{i,\max}$ of all storage from 0.1 kW to
3 kW. Note that for the greedy algorithm, $u_{i,\max}$ only affects
the constraints of the net charging and discharging amount,
and for the proposed algorithm, $u_{i,\max}$ additionally affects the
design of $V_{i,\max}$. We see that the system cost of the greedy
algorithm decreases with $u_{i,\max}$, while the system cost of the
proposed algorithm first decreases and then increases. For the
proposed algorithm, the increasing trend of the system cost
could be explained using Theorem 1.1, in which the gap to
the optimal objective value increases with $u_{i,\max}$. Moreover,
from the figure, when $u_{i,\max}$ is less than 1.5 kW, or, when the
charging duration of the storage is larger than 6.6 time units,
the proposed algorithm outperforms the greedy one. Since the
time scale we consider is seconds to minutes, this is the case
for most batteries as the time scale of their charging duration
is hours US). To improve the algorithm for large u, lnax is left 
for the future. 

D. Effect of Other System Parameters 

In Fig. 10, we show the power flow $f_{i,t}$ between the
substation and the i-th phase as well as their average, for
i = 1, 2, 3. Recall that the purpose of phase balancing is to
make $f_{i,t}$ of all phases as close as possible. The figure shows
that the curves of the power flows coincide most of the time.
To further narrow the gap of these curves, we can increase
the coefficient of the loss function F(x) so as to impose more
penalty for the flow deviation. In return, the system cost would
be higher.

Fig. 10. Time trajectory of power flow $f_{i,t}$, for i = 1, 2, 3.

Fig. 11. System cost versus number of total phases.

Although three-phase transmission is dominant in practice,
we are interested in finding how the number of phases
affects the algorithm performance. In Fig. 11, we increase the
number of phases N from 2 to 8. For both algorithms, the
system cost grows linearly with N, which is expected since
the system cost sums up the costs of all phases. Moreover, as
N increases, the performance gain of the proposed algorithm
over the greedy algorithm increases.

E. Convergence of Distributed Implementation 

In Fig. 12, we examine the convergence behavior of the
distributed algorithm presented in Section III-B for various
values of the parameter $\rho$. We show the gap between the
objective value of a per-slot optimization problem at iteration k
and its minimum objective value over iterations. We see that,
for all $\rho$ values, the gap diminishes at a linear convergence
rate. In particular, setting $\rho = 5$ leads to the best convergence
performance. For a moderate accuracy requirement, the iterative
procedure can be stopped within 20 iterations. The fast
convergence of the proposed algorithm is observed in general
with appropriate $\rho$ values, and we omit the curves of other
per-slot optimization problems for brevity.

Fig. 12. Performance gap versus number of iterations for distributed
algorithm.


VI. Conclusion and Future Work 

We have investigated the problem of phase balancing with
energy storage. We have proposed both centralized and
distributed real-time algorithms for ideal energy storage and
further extended the algorithms to accommodate non-ideal
energy storage. Moreover, we have conducted extensive
simulation to evaluate the algorithm performance, showing that
it can substantially outperform a greedy alternative. Our key
conclusions are that positive correlations between the phases
make phase balancing easier, and that evenly allocating storage
over the phases results in the best performance.

For future work, we are interested in incorporating system
statistics into the algorithm design to further improve
performance, and also combining energy storage with traditional
methods such as feeder reconfiguration for phase balancing.


Appendix A 

Proof of Relaxation from P1 to P2

Using the energy state update $s_{i,t+1} = s_{i,t} + u_{i,t}$, we can
derive that the left hand side of the time-averaged constraint on
$u_{i,t}$ in P2 equals the following:

$$\lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[u_{i,t}] = \lim_{T \to \infty} \frac{\mathbb{E}[s_{i,T}]}{T} - \lim_{T \to \infty} \frac{\mathbb{E}[s_{i,0}]}{T}. \qquad (16)$$

In (16), if $s_{i,t}$ is always bounded, i.e., the energy state constraint
of P1 holds, then the right hand side of (16) equals zero and thus the
time-averaged constraint is satisfied. Therefore, P2 is a relaxed
problem of P1.
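The telescoping argument behind (16) can be checked numerically with the short sketch below. It draws arbitrary charging decisions, clips them so the energy state stays within its bounds, and compares the time-averaged net charge with $(s_T - s_0)/T$. The default values and the function name are illustrative assumptions only.

```python
import numpy as np

def time_averaged_net_charge(rng, s0=6.0, s_min=2.0, s_max=10.0,
                             u_max=1.0, n_slots=500):
    """Simulate s_{t+1} = s_t + u_t with bounded s_t and return
    (average of u_t, (s_T - s_0) / T)."""
    s = s0
    total = 0.0
    for _ in range(n_slots):
        u = rng.uniform(-u_max, u_max)
        u = min(max(u, s_min - s), s_max - s)  # keep s within [s_min, s_max]
        s += u
        total += u
    return total / n_slots, (s - s0) / n_slots

rng = np.random.default_rng(3)
avg_u, telescoped = time_averaged_net_charge(rng)
```

Both returned quantities agree up to rounding, and their magnitude is at most $(s_{\max} - s_{\min})/T$, which vanishes as T grows.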


Proof: Based on the definition of $L(s_{i,t})$ and the update
of $s_{i,t}$, we have

$$L(s_{i,t+1}) - L(s_{i,t}) = \frac{1}{2}\big[(s_{i,t+1} - \beta_i)^2 - (s_{i,t} - \beta_i)^2\big] \le (s_{i,t} - \beta_i)\,u_{i,t} + \frac{1}{2} u_{i,\max}^2.$$

Using the upper bound above for all phases $i \in \mathcal{E}$, taking
the conditional expectation, and then adding the term $\mathbb{E}[w_t | s_t]$
gives the desired upper bound. ■


Appendix C 

Proof of Proposition 1

Since the per-slot problem P3 includes all constraints of P1
except the energy state constraint, the key of the feasibility
proof is to show that the energy state $s^*_{i,t}$ is bounded within
the interval $[s_{i,\min}, s_{i,\max}]$. To this end, we first prove the
following lemma, which gives a sufficient condition for charging
or discharging.

Lemma 2: Under Algorithm 1, for $i \in \mathcal{E}$,

1) if $s_{i,t} < \beta_i - V_i(p_{\max} + D'_{i,\max} + C'_{i,\max})$, then $u^*_{i,t} = u_{i,\max}$;

2) if $s_{i,t} > \beta_i - V_i(p_{\min} + D'_{i,\min} + C'_{i,\min})$, then $u^*_{i,t} = -u_{i,\max}$.

Proof: For simplicity of notation, we drop the time index
t in P3. Using the power balance constraint, we replace $l_j$ with
$u_j - f_j - r_j$ in the objective of P3. Next we solve P3 through the
partitioning method by first fixing the optimization variables f and
$u_j$, $j \ne i$, and then minimizing over $u_i$. The optimization problem with
respect to $u_i$ is as follows.

$$\min_{u_i} \; p\,u_i + D_i(u_i) + C_i(u_i - f_i - r_i) + \frac{(s_i - \beta_i)\,u_i}{V_i} \quad \text{s.t.} \; -u_{i,\max} \le u_i \le u_{i,\max}.$$

The derivative of the objective above with respect to $u_i$ is
$p + D'_i(u_i) + C'_i(u_i - f_i - r_i) + \frac{s_i - \beta_i}{V_i}$.
Therefore, if $s_i$ is upper bounded as shown in Lemma 2.1), the
derivative is negative and thus $u^*_i = u_{i,\max}$. Or, if $s_i$ is lower
bounded as shown in Lemma 2.2), the derivative is positive and
thus $u^*_i = -u_{i,\max}$. ■

Using Lemma 2 above and the definition of $\beta_i$, we can easily
show the boundedness of the energy state by mathematical
induction, which is omitted here.


Appendix B 

An Upper Bound of the Drift-Plus-Cost Function 

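In the following lemma, we show that the drift-plus-cost
function is upper bounded.

Lemma 1: For all possible decisions and all possible values
of $s_{i,t}$, $i \in \mathcal{E}$, at each time slot t, the drift-plus-cost function
is upper bounded as follows:

$$\Delta(s_t) + \mathbb{E}[w_t | s_t] \le \mathbb{E}[w_t | s_t] + \sum_{i \in \mathcal{E}} \Big[ \frac{u_{i,\max}^2}{2V_i} + \frac{s_{i,t} - \beta_i}{V_i}\, \mathbb{E}[u_{i,t} | s_t] \Big].$$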


Appendix D 
Proof of Theorem 1

We prove Theorem 1.1 and Theorem 1.2 together. Denote $\tilde{w}$
as the optimal objective value of P2. In the following lemma,
we show the existence of a special algorithm for P2.

Lemma 3: For P2, there exists a stationary and randomized
solution $a^s_t$ that only depends on the system state $q_t$, and at
the same time satisfies the following conditions:

$$\mathbb{E}[w^s_t] \le \tilde{w}, \ \forall t, \qquad \mathbb{E}[u^s_{i,t}] = 0, \ \forall i \in \mathcal{E}, t,$$

where the expectations are taken over the randomness of the
system state and the possible randomness of the actions.

The proof of Lemma 3 follows from Theorem 4.5 in [14]
and is omitted for brevity. Using Lemmas 1 and 3, the
drift-plus-cost function under Algorithm 1 can be upper bounded
as follows:

$$\Delta(s_t) + \mathbb{E}[w^*_t | s_t] \le \mathbb{E}[w^s_t | s_t] + \sum_{i \in \mathcal{E}} \Big[ \frac{u_{i,\max}^2}{2V_i} + \frac{s_{i,t} - \beta_i}{V_i}\, \mathbb{E}[u^s_{i,t} | s_t] \Big] \qquad (17)$$

$$\le \tilde{w} + \sum_{i \in \mathcal{E}} \frac{u_{i,\max}^2}{2V_i} \qquad (18)$$

$$\le w^{\mathrm{opt}} + \sum_{i \in \mathcal{E}} \frac{u_{i,\max}^2}{2V_i}, \qquad (19)$$

where (17) is derived based on Lemma 1 and the fact that
P3 minimizes the upper bound of the drift-plus-cost function,
(18) is derived based on Lemma 3 and the fact that the action
$a^s_t$ is independent of $s_t$, and the inequality in (19) holds since
P2 is a relaxed problem of P1.

Taking expectations over $s_t$ on both sides of the above inequality and
summing over $t \in \{0, \ldots, T-1\}$ yields

$$\sum_{i \in \mathcal{E}} \frac{\mathbb{E}[L(s_{i,T})] - \mathbb{E}[L(s_{i,0})]}{V_i} + \sum_{t=0}^{T-1} \mathbb{E}[w^*_t] \le \Big( w^{\mathrm{opt}} + \sum_{i \in \mathcal{E}} \frac{u_{i,\max}^2}{2V_i} \Big) T.$$

Note that $L(s_{i,T})$ is non-negative. Divide both sides of the
above inequality by T. After some arrangement, there is

1) if $s_{i,t} < \beta_i - V_i\big(\frac{p_{\max}}{\eta_i^+} + D'_{i,\max} + \frac{1}{\eta_i^+} C'_{i,\max}\big)$, then $u^{+\prime}_{i,t} = u_{i,\max}$;

2) if $s_{i,t} > \beta_i - V_i\big(p_{\min}\eta_i^- + D'_{i,\min} + \eta_i^- C'_{i,\min}\big)$, then $u^{-\prime}_{i,t} = u_{i,\max}$.

Using Lemma 4 and the mathematical induction arguments,
we can show that $s'_{i,t} \in [s_{i,\min}, s_{i,\max}]$, $\forall i \in \mathcal{E}$. Note that the
adjustment from $(u^{+\prime}_{i,t}, u^{-\prime}_{i,t})$ to $(u^{+*}_{i,t}, u^{-*}_{i,t})$ does not change the
difference $u^+_{i,t} - u^-_{i,t}$. Therefore, the resultant energy state $s^*_{i,t}$
equals $s'_{i,t}$ and thus is bounded within $[s_{i,\min}, s_{i,\max}]$.

2) We prove Theorem 3.2 and Theorem 3.3 together. Similar
to the ideal case, the relaxed problem of P1 can be formed as
follows.

P2': $\min_{\{a_t\}} \; \limsup_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[w_t]$

s.t. the retained per-slot constraints of P1,

$$\lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[u^+_{i,t} - u^-_{i,t}] = 0, \ \forall i \in \mathcal{E}.$$

Denote the optimal value of P2' by $w'$. We first give the
following two lemmas, which can be shown similarly to
Lemmas 1 and 3.

Lemma 5: For all possible decisions and all possible values
of $s_{i,t}$, $i \in \mathcal{E}$, in each time slot t, the drift-plus-cost function
is upper bounded as follows






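$$\frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[w^*_t] - w^{\mathrm{opt}} \le \sum_{i \in \mathcal{E}} \Big[ \frac{u_{i,\max}^2}{2V_i} + \frac{\mathbb{E}[L(s_{i,0})]}{T V_i} \Big], \qquad (20)$$

which is the conclusion in Theorem 1.2. Taking limsup on
both sides of (20) gives Theorem 1.1.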


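$$\Delta(s_t) + \mathbb{E}[w_t | s_t] \le \sum_{i \in \mathcal{E}} \frac{u_{i,\max}^2}{2V_i} + \mathbb{E}[w_t | s_t] + \sum_{i \in \mathcal{E}} \frac{s_{i,t} - \beta_i}{V_i}\, \mathbb{E}[u^+_{i,t} - u^-_{i,t} | s_t]. \qquad (21)$$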


Appendix E 

Proof of Proposition 2

Denote S as the fixed total energy capacity of storage.
For simplicity of notation, we drop the index i when the
parameters are the same over all phases or storage units. Given
the assumptions in Proposition 2, the optimization problem can
be formulated as follows.

$$\min_{\{s_{i,\max}\}} \; \sum_{i \in \mathcal{E}} \frac{u_{\max}^2\,(p_{\max} - p_{\min} + D'_{\max} - D'_{\min} + C'_{\max} - C'_{\min})}{2\,(s_{i,\max} - s_{\min} - 2u_{\max})} \quad \text{s.t.} \; \sum_{i \in \mathcal{E}} s_{i,\max} = S,$$

where we have replaced $V_{i,\max}$ with its definition. It
can be easily checked that the above problem is a convex
optimization problem. Using the Karush-Kuhn-Tucker (KKT)
conditions [33], the optimal solutions of $s_{i,\max}$ must be equal
over i.
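The equal-allocation conclusion can be illustrated numerically. The sketch below evaluates an objective of the form $\sum_i k / (2(s_{i,\max} - s_{\min} - 2u_{\max}))$, where the constant `k` is a stand-in for the capacity-independent numerator, and compares the equal split against random feasible splits of the same total capacity. The function name `gap_bound` and the numerical values are assumptions made for this example.

```python
import numpy as np

def gap_bound(s_max, k=1.0, s_min=2.0, u_max=1.0):
    """Sum over phases of k / (2 (s_max_i - s_min - 2 u_max)).
    `k` stands in for the capacity-independent numerator."""
    s_max = np.asarray(s_max, dtype=float)
    return float(np.sum(k / (2.0 * (s_max - s_min - 2.0 * u_max))))

rng = np.random.default_rng(4)
total = 30.0                       # fixed total capacity S (kWh)
equal = gap_bound([total / 3] * 3)
# Random allocations with every s_max_i above s_min + 2 u_max = 4 kWh.
random_allocs = [4.5 + (total - 13.5) * rng.dirichlet(np.ones(3))
                 for _ in range(1000)]
best_random = min(gap_bound(s) for s in random_allocs)
```

Since each term is convex and decreasing in $s_{i,\max}$, no random allocation with the same total falls below the value obtained with the equal split.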


Appendix F 
Proof of Theorem 3

1) To show the feasibility of $\{a^*_t\}$, it suffices to show that
the resultant energy state $s^*_{i,t}$, $i \in \mathcal{E}$, is bounded. First we give
sufficient conditions of charging and discharging, which can
be shown similarly to Lemma 2.

Lemma 4: For $i \in \mathcal{E}$,

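Lemma 6: For P2', there exists a stationary and randomized
solution $a^s_t$ that only depends on the system state $q_t$, and at
the same time satisfies the following conditions:

$$\mathbb{E}[w^s_t] \le w', \ \forall t, \qquad (22)$$

$$\mathbb{E}[u^{+s}_{i,t} - u^{-s}_{i,t}] = 0, \ \forall i \in \mathcal{E}, t. \qquad (23)$$

Denote the objective values of P3' under $a'_t$ and the adjusted
solution $a^*_t$ by $g'_t$ and $g^*_t$, respectively. In the following lemma,
we characterize the gap between $g'_t$ and $g^*_t$.

Lemma 7: Under the proposed algorithm, at each time t
we have $g^*_t - g'_t \le \epsilon$, where
$\epsilon = \sum_{i \in \mathcal{E}} \big[ p_{\max} u_{i,\max} \big( \frac{1}{\eta_i^+} + \eta_i^- \big) + 2D_{i,\max} + C_{i,\max} \big]$.

Proof: Using the objective of P3', we have

$$g^*_t - g'_t = \sum_{i \in \mathcal{E}} \Big[ p_t \Big( \frac{1}{\eta_i^+} u^{+*}_{i,t} - \eta_i^- u^{-*}_{i,t} \Big) + D_i(u^{+*}_{i,t}) + D_i(-u^{-*}_{i,t}) + C_i(l^*_{i,t})$$
$$\qquad\qquad - p_t \Big( \frac{1}{\eta_i^+} u^{+\prime}_{i,t} - \eta_i^- u^{-\prime}_{i,t} \Big) - D_i(u^{+\prime}_{i,t}) - D_i(-u^{-\prime}_{i,t}) - C_i(l'_{i,t}) \Big]$$
$$\le \sum_{i \in \mathcal{E}} \Big[ p_t \frac{1}{\eta_i^+} u^{+*}_{i,t} + p_t \eta_i^- u^{-\prime}_{i,t} + D_i(u^{+*}_{i,t}) + D_i(-u^{-*}_{i,t}) + C_i(l^*_{i,t}) \Big]$$
$$\le \epsilon.$$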
< e. 
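Using Lemmas 5, 6, and 7, the drift-plus-cost function
can be further upper bounded as follows.

$$\Delta(s^*_t) + \mathbb{E}[w^*_t | s^*_t] \le \mathbb{E}[w'_t | s^*_t] + \sum_{i \in \mathcal{E}} \Big[ \frac{u_{i,\max}^2}{2V_i} + \frac{s^*_{i,t} - \beta_i}{V_i}\, \mathbb{E}[u^{+\prime}_{i,t} - u^{-\prime}_{i,t} | s^*_t] \Big] + \epsilon$$
$$\le \mathbb{E}[w^s_t | s^*_t] + \sum_{i \in \mathcal{E}} \Big[ \frac{u_{i,\max}^2}{2V_i} + \frac{s^*_{i,t} - \beta_i}{V_i}\, \mathbb{E}[u^{+s}_{i,t} - u^{-s}_{i,t} | s^*_t] \Big] + \epsilon$$
$$\le w' + \sum_{i \in \mathcal{E}} \frac{u_{i,\max}^2}{2V_i} + \epsilon$$
$$\le w^{\mathrm{opt}} + \sum_{i \in \mathcal{E}} \frac{u_{i,\max}^2}{2V_i} + \epsilon.$$

The remaining proof is similar to that for Theorem 1 and is
omitted for brevity.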